Results 1 - 4 of 4
1.
PLoS One ; 19(2): e0299169, 2024.
Article in English | MEDLINE | ID: mdl-38422081

ABSTRACT

Prokaryotic chromosomes contain numerous small open reading frames (ORFs) of fewer than 200 bases. Since high-throughput proteomics methods often miss proteins containing fewer than 60 amino acids, it is difficult to discern whether these ORFs encode proteins. Recent studies have revealed that many small proteins are membrane proteins with a single membrane-anchoring α-helix. As membrane-anchoring or transmembrane motifs can be identified with high confidence using computational algorithms like Phobius and TMHMM, small membrane proteins (SMPs) can be predicted with high accuracy. This study employed a systematic approach, utilizing well-verified algorithms such as Orfipy, Phobius, and BLAST to identify SMPs in prokaryotic organisms. Our main search parameters targeted candidate SMPs with an open reading frame between 60 and 180 nucleotides, a membrane-anchoring or transmembrane region between 15 and 30 amino acids long, and sequence conservation among other microorganisms. Our findings indicate that each prokaryote possesses many SMPs, with some identified in the intergenic regions of currently annotated chromosomes. More extensively studied microorganisms, such as Escherichia coli and Bacillus subtilis, have more SMPs identified in their genomes than less studied microorganisms, suggesting the possibility of undiscovered SMPs in the latter. In this study, we describe the common SMPs identified across various microorganisms and explore their biological roles. We have also developed a software pipeline and an accompanying online interface for discovering SMPs (http://cs.indstate.edu/pro-smp-finder). This resource aims to assist researchers in identifying new SMPs encoded in microbial genomes of interest.
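
The search criteria in this abstract (a 60 to 180 nucleotide ORF carrying a hydrophobic stretch) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it scans only the forward strand, counts ORF length from the start codon through the stop codon (an assumption), and substitutes a crude Kyte-Doolittle hydropathy window for Phobius. The function names and the `min_mean=1.6` cutoff are hypothetical, and the conservation (BLAST) step is omitted.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}

# Kyte-Doolittle hydropathy scale (a stand-in for a real predictor such as Phobius).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
      "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
      "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}


def small_orfs(seq, min_nt=60, max_nt=180):
    """Yield (start, end, protein) for forward-strand ATG..stop ORFs in range."""
    seq = seq.upper()
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            j = i
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOPS:
                j += 3
            if j + 3 > len(seq):
                break  # no in-frame stop codon before the sequence ends
            end = j + 3  # ORF length includes the stop codon (assumption)
            if min_nt <= end - i <= max_nt:
                codons = (seq[k:k + 3] for k in range(i, j, 3))
                yield i, end, "".join(CODON_TABLE.get(c, "X") for c in codons)
            i = end


def has_hydrophobic_window(protein, window=15, min_mean=1.6):
    """True if any `window`-residue stretch has mean hydropathy >= min_mean."""
    scores = [KD.get(a, 0.0) for a in protein]
    return any(
        sum(scores[s:s + window]) / window >= min_mean
        for s in range(len(protein) - window + 1)
    )


def candidate_smps(seq):
    """Small ORFs whose translation contains a hydrophobic window."""
    return [hit for hit in small_orfs(seq) if has_hydrophobic_window(hit[2])]
```

A real analysis would also scan the reverse strand, allow alternative start codons, and pass the translated candidates to a dedicated transmembrane predictor and a homology search.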


Subjects
Antifibrinolytic Agents , Membrane Proteins , Membrane Proteins/genetics , Membranes , Algorithms , Amino Acids , Escherichia coli/genetics
2.
Mol Ecol Resour ; 18(3): 590-601, 2018 May.
Article in English | MEDLINE | ID: mdl-29455464

ABSTRACT

Different second-generation sequencing technologies may have taxon-specific biases when DNA metabarcoding prey in predator faeces. Our major objective was to examine differences in prey recovery from bat guano across two different sequencing workflows using the same faecal DNA extracts. We compared results between the Ion Torrent PGM and the Illumina MiSeq with similar library preparations and the same analysis pipeline. We focus on repeatability and provide an R Notebook in an effort towards transparency for future methodological improvements. Full documentation of each step enhances the accessibility of our analysis pipeline. We tagged DNA from insectivorous bat faecal samples, targeted the arthropod cytochrome c oxidase I minibarcode region and sequenced the product on both second-generation sequencing platforms. We developed an analysis pipeline with a high operational taxonomic unit (OTU) clustering threshold (i.e., ≥98.5%) followed by copy number filtering to avoid merging rare but genetically similar prey into the same OTUs. With this workflow, we detected 297 unique prey taxa, of which 74% were identified at the species level. Of these, 104 (35%) prey OTUs were detected by both platforms, 176 (59%) OTUs were detected by the Illumina MiSeq system only, and 17 (6%) OTUs were detected using the Ion Torrent system only. Costs were similar between platforms but the Illumina MiSeq recovered six times more reads and four additional insect orders than did Ion Torrent. The considerations we outline are particularly important for long-term ecological monitoring; a more standardized approach will facilitate comparisons between studies and allow faster recognition of changes within ecological communities.
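
The OTU counts reported in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the three counts quoted above; the dictionary keys and variable names are illustrative.

```python
# Prey OTU counts as reported in the abstract.
counts = {
    "both platforms": 104,
    "Illumina MiSeq only": 176,
    "Ion Torrent only": 17,
}

total = sum(counts.values())  # should match the 297 unique prey taxa
shares = {label: round(100 * n / total) for label, n in counts.items()}

# Totals seen by each platform, counting shared OTUs for both.
miseq_total = counts["both platforms"] + counts["Illumina MiSeq only"]
ion_torrent_total = counts["both platforms"] + counts["Ion Torrent only"]

for label, pct in shares.items():
    print(f"{label}: {counts[label]} OTUs ({pct}%)")
print(f"MiSeq detected {miseq_total} of {total}; Ion Torrent detected {ion_torrent_total}")
```

The three categories sum to 297 and round to 35%, 59%, and 6%, consistent with the figures in the abstract.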


Subjects
Chiroptera/physiology , DNA Barcoding, Taxonomic/methods , Feces/chemistry , Animals , Classification/methods , DNA Barcoding, Taxonomic/standards , Diet , Feeding Behavior , Reproducibility of Results , Workflow
3.
BMC Bioinformatics ; 18(Suppl 11): 382, 2017 Oct 03.
Article in English | MEDLINE | ID: mdl-28984182

ABSTRACT

BACKGROUND: It is generally thought that most canonical or non-canonical splicing events involving U2- and U12-type spliceosomes occur within nuclear pre-mRNAs. However, whether at least some U12-type splicing occurs in the cytoplasm remains unclear. In recent years, next-generation sequencing technologies have revolutionized the field. The "Read-Split-Walk" (RSW) and "Read-Split-Run" (RSR) methods were developed to identify genome-wide non-canonical spliced regions, including special events occurring in the cytoplasm. As significant amounts of genome/transcriptome data, such as those from the Encyclopedia of DNA Elements (ENCODE) project, have been generated, we have developed a newer, more memory-efficient version of the algorithm, "Read-Split-Fly" (RSF), which can detect non-canonical spliced regions with higher sensitivity and improved speed. The RSF algorithm also outputs the spliced sequences for downstream biological function analysis. RESULTS: We used open-access ENCODE project RNA-Seq data to search spliced intron sequences against the U12-type spliced intron sequence database to examine whether some events could occur as potential signatures of U12-type splicing. The check was performed by searching spliced sequences against 5'ss and 3'ss sequences from the well-known orthologous U12-type spliceosomal intron database U12DB. Preliminary results from searching 70 ENCODE samples indicated that 5'ss with a U12-type signature are more frequent than those with a U2-type signature and are prevalent in non-canonical junctions reported by RSF. The selected spliced sequences have also been further studied using miRBase to elucidate their functionality. Preliminary results from the same 70 ENCODE samples show that several miRNAs are prevalent in the studied samples. Two of these, hsa-miR-1273 and hsa-miR-548, are associated with many diseases and cancers, as suggested in the literature.
CONCLUSIONS: Our RSF pipeline is able to detect many possible junctions (especially those with a high RPKM) with very high overall accuracy and relatively high accuracy for novel junctions. We have incorporated useful parameter features into the pipeline, such as handling variable-length read data and searching spliced sequences for splicing signatures and miRNA events. We suggest that RSF, a tool for identifying novel splicing events, is applicable to the study of a range of diseases across biological systems under different experimental conditions.


Subjects
Algorithms , Alternative Splicing/genetics , Genome , Base Sequence , Databases, Nucleic Acid , Humans , Introns/genetics , MicroRNAs/genetics , MicroRNAs/metabolism , RNA Splice Sites/genetics
4.
BMC Genomics ; 17(Suppl 7): 503, 2016 Aug 22.
Article in English | MEDLINE | ID: mdl-27556805

ABSTRACT

BACKGROUND: Most existing tools for detecting next-generation sequencing-based splicing events focus on generic splicing events. Consequently, special types of non-canonical splicing events of short mRNA regions (IRE1α targeted) have not yet been thoroughly addressed at a genome-wide level using bioinformatics approaches in conjunction with next-generation technologies. During endoplasmic reticulum (ER) stress, the RNase Ire1α is known to splice out a short 26 nt region from the mRNA of the transcription factor Xbp1 non-canonically within the cytosol. This causes an open reading frame-shift that induces expression of many downstream genes in response to ER stress as part of the unfolded protein response (UPR). We previously published an algorithm termed "Read-Split-Walk" (RSW) to identify non-canonical splicing regions using RNA-Seq data and applied it to ER stress-induced Ire1α heterozygote and knockout mouse embryonic fibroblast cell lines. In this study, we have developed an improved algorithm, "Read-Split-Run" (RSR), for detecting genome-wide Ire1α-targeted genes with non-canonical spliced regions at a faster speed. We applied the RSR algorithm, using different combinations of several parameters, to the mouse embryonic fibroblast (MEF) cells previously tested with RSW and to the human Encyclopedia of DNA Elements (ENCODE) RNA-Seq data. We also compared the performance of RSR with that of two other tools for identifying alternative splicing events, TopHat (Trapnell et al., Bioinformatics 25:1105-1111, 2009) and Alt Event Finder (Zhou et al., BMC Genomics 13:S10, 2012), using the spliced Xbp1 mRNA as a positive control: in our data sets it was the top cleavage target, present in Ire1α (+/-) but absent in Ire1α (-/-) MEF samples. This comparison was also extended to human ENCODE RNA-Seq data.
RESULTS: As proof of principle, the 26 nt non-conventional splice site in Xbp1 was detected as the top hit by our new RSR algorithm in heterozygote (Het) samples from both Thapsigargin (Tg) and Dithiothreitol (Dtt) treated experiments but was absent in the negative control Ire1α knock-out (KO) samples. Applying different combinations of parameters to the mouse MEF RNA-Seq data, we suggest a General Linear Model (GLM) for both Tg and Dtt treated experiments. We also ran RSR on a human ENCODE RNA-Seq dataset and identified 32,597 spliced regions for regular chromosomes. TopHat (Trapnell et al., Bioinformatics 25:1105-1111, 2009) and Alt Event Finder (Zhou et al., BMC Genomics 13:S10, 2012) identified 237,155 spliced junctions and 9,129 exon skipping events (excluding chr14), respectively. Our Read-Split-Run algorithm also outperformed the others in ranking the Xbp1 gene as the top cleavage target present in Ire1α (+/-) but absent in Ire1α (-/-) MEF samples. The RSR package, including source code, is available at http://bioinf1.indstate.edu/RSR, and its pipeline source code is also freely available at https://github.com/xuric/read-split-run for academic use. CONCLUSIONS: Our new RSR algorithm is capable of processing massive amounts of human ENCODE RNA-Seq data to identify novel splice junction sites at a genome-wide level much more efficiently than the previous RSW algorithm. Our proposed model can also predict the number of spliced regions under any combination of parameters. Our pipeline can detect novel spliced sites for other species using RNA-Seq data generated under similar conditions.
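
The general idea of detecting a short spliced-out region from a read that does not map contiguously can be sketched in Python as follows. This is a toy illustration of split-read matching with exact string search, not the RSR implementation; the function name `find_split_gap`, the `min_anchor` parameter, and the sequences in the usage example are all hypothetical.

```python
def find_split_gap(read, ref, min_anchor=8):
    """Return (gap_start, gap_end) in `ref` if `read` matches as two exact
    pieces separated by a gap, or None if it maps contiguously or not at all.

    Each piece must be at least `min_anchor` bases long (assumed threshold).
    """
    if ref.find(read) != -1:
        return None  # the read maps contiguously, so there is no splice to report
    for k in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:k], read[k:]
        lpos = ref.find(left)
        while lpos != -1:
            # The right piece must start at least one base after the left piece ends.
            rpos = ref.find(right, lpos + k + 1)
            if rpos != -1:
                return lpos + k, rpos
            lpos = ref.find(left, lpos + 1)
    return None


# Usage with made-up sequences: a 26 nt region is missing from the read.
upstream = "ACGTTGCAAGCTTAGGCTAC"
removed = "GTCCGATTACGGATCCATGCAAGTCA"  # 26 nt, hypothetical
downstream = "TTGACCGTAAGCGTTCAGGA"
ref = upstream + removed + downstream
read = upstream[-12:] + downstream[:12]

gap = find_split_gap(read, ref)
print(gap, gap[1] - gap[0])
```

A production tool would use an indexed aligner, tolerate mismatches, and aggregate supporting reads per region before ranking candidates.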


Subjects
Alternative Splicing/genetics , Base Sequence/genetics , Genome , RNA Splice Sites/genetics , RNA Splicing/genetics , Algorithms , Animals , Computational Biology/methods , DNA-Binding Proteins/genetics , Databases, Genetic , Genomics/methods , Humans , Mice , Software , Unfolded Protein Response/genetics